UNTRANSLATED EXON-1 SEQUENCES OF
EUKARYOTIC GENES AS PROMOTERS



FIELD OF THE INVENTION 



This invention relates to a novel function of untranslated TATA-less
first exon sequences in eukaryotic genes, the use of these sequences in
regulating the expression of eukaryotic genes, and the construction of
novel expression vectors which provide fine control of the expression
levels of specific genes in vitro and in vivo, for use for example in
human gene therapy.

BACKGROUND OF THE INVENTION AND PRIOR ART

Background literature references are referred to herein by way of 
parenthetical numerical citation in the text to the appended bibliography. 
The disclosures of these references are incorporated herein by reference. 

As is well known, all eukaryotic genes contain exons and introns, the
multiple functions of which in gene regulation processes are still the
subject of much investigation. A striking finding in recent progress of
human genome mapping is the high degree of conservation of intron
sequences, which predicts the importance of these large regions whose
functions are still largely unknown. Another interesting phenomenon is
that in the case of many eukaryotic genes, their first exons are highly
conserved, actively transcribed, but untranslated. Remarkable progress
in the study of untranslated first exons has been made during recent
years. For example, the first exons of a number of genes have been
shown to play roles in transcription regulation; they are required either
for the full activity of their promoters, or for interaction with upstream
cis-elements to regulate the expression level of genes. Such exonic
control regions have been described for example in the case of the HSV
tk gene, where a portion of its first exon is required for full activity from
the tk promoter (1), and in the case of the skeletal troponin I (sTnI) gene,
where the first exon is required for full muscle-specific activity of the
sTnI promoter. The conclusion here is that the sTnI exon 1 does not
contain a transcriptional enhancer, but the interaction between sTnI exon
1 and the distal upstream region is necessary for the expression
regulatory mechanism (2). Other exonic control regions previously
described include: RNA polymerase III transcribed rRNAs (3,4) and
tRNAs (5), heat shock genes (6 - 9), and the gastrin gene (10). The c-myc
gene is an interesting model, having at least four different promoters.
The majority of transcripts start at the P1 (-161) and P2 (+1) promoters
in normal Burkitt's lymphoma cells (BL), which account for about
10-25% and 75-90% of c-myc RNA respectively. In BL cells with the
chromosomal breakpoint upstream of the gene, c-myc is preferentially
transcribed from the P1 promoter, shifting the ratio of P1/P2 usage to >1.
Notably the P2 promoter is located in the c-myc exon 1 region and
contains a potential TATA box (29 - 23) which can be recognized by both
polymerases II and III. The sequences within the P2 promoter affect the
elongation and premature termination of transcripts initiated from the 5'
upstream P1 promoter.

Two further c-myc promoters are the P0 promoter, which is located
55-650 bp upstream of P1, and the P3 promoter, which is located in the
first intron of the gene. The potential coding capacity of these P1/P2, P0
and P3 RNAs is different. P1/P2 RNAs contain a single large open reading
frame and encode two c-myc proteins of 64 and 67 kd; translation of the
67 kd protein is initiated at a CTG codon at the end of exon 1, whereas
the 64 kd protein is initiated in exon 2, using an AUG codon. However,
P0 RNA contains three open reading frames, the 5' and middle open
reading frames being initiated upstream of the P1 promoter. The 3'
open reading frame is identical to that of the P1/P2 64/67 kd proteins. P3
RNA is initiated at multiple start sites within the first intron and lacks
the exon 1 sequences, and is only able to code for the 64 kd c-myc protein
(11 - 14).

In eukaryotes, transcription is carried out by three different RNA
polymerases: RNA polymerase I, II and III, each of which is dedicated
to the transcription of different sets of genes. The genes in each class
contain characteristic promoters, which usually consist of two types of
functional elements: core (basal) promoter elements and modulator
(upstream) promoter elements. The core promoter elements are
sufficient to determine RNA polymerase specificity and direct low levels
of transcription, whereas the modulator elements enhance or reduce the
basal levels of transcription. The core promoter elements are first
recognized by specific transcription factors, which then recruit the
specific RNA polymerase. However, recent studies have discovered that
one general factor, the TATA box-binding protein (TBP), plays a role
in all three polymerase systems, and the mechanisms of initiation
complex assembly are strongly conserved. There are thus two
fundamental mechanisms of initiation complex assembly. For
TATA-containing promoters (polymerase II and III), TBP binds to the
TATA box directly with its concave DNA-binding surface, leaving its
large convex surface available for other proteins (i.e. activators, general
factors, polymerase) to bind. For non-TATA promoters (all three
polymerases), another protein binds to the DNA, i.e. upstream binding
factor for the polymerase I family, a transcription factor like Sp1 for the
polymerase II family, and TFIIIC for classical polymerase III promoters.
These proteins then recruit TBP directly or indirectly via a
TBP-associated factor (TAF). Once TBP is in the initiation complex, it
can then facilitate the binding of other proteins (15-17).

Transcription initiation is one of the most important ways to regulate
gene expression. Although most protein-encoding genes contain a
TATA motif, many promoters transcribed by polymerase II, especially
those of housekeeping genes, lack a TATA element. These are referred
to as TATA-less promoters. In some promoters, a second type of core
promoter element called a transcriptional initiator (Inr) has been found
to mediate the same functions as the TATA motif. An Inr can be
defined as a DNA sequence element that overlaps a transcription start
site, is sufficient for determining the start site location in a promoter
that lacks a TATA box, and can also cooperate with a TATA box to
enhance promoter activity. Various Inr elements have been
described and classified according to sequence homology, for example,
the TdT-Inr family, the PBGD-Inr family, the DHFR-Inr family,
the ribosomal protein-Inr family, the adeno-associated virus p5-Inr
family etc. It has also been shown that TFIID is an integral component
of the transcription initiation complex from almost all TATA-less
promoters studied. It is strongly suggested that recognition of the Inr by
universal or multiple Inr binding proteins (ITF) might provide a means
by which a transcription competent complex can assemble (18 - 20).

Myosin is one of only three proteins known to convert chemical energy
into mechanical work, i.e. to function as a molecular motor. Each
myosin molecule is composed of two myosin heavy chains and four
myosin light chains. All smooth muscle myosin heavy chain (SMHC)
isoforms (SM1, SM2 and SMB) have so far been shown to be transcribed
from a single gene by alternative RNA splicing, which results in
divergences at the carboxyl termini (SM1 and SM2), or at the 25/50 kD
junction (SMB). The expression of the SMHC gene is strictly controlled
in a developmental stage-specific and tissue-specific manner. SM1 and
SM2 have so far been found to be specific to vascular and nonvascular
smooth muscle cells. Smooth muscle cell proliferation in experimental
arteriosclerosis and atherosclerosis was associated with dedifferentiation
of the smooth muscle cells toward the embryonic phenotype on the basis
of the SMHC expression. These existing SMHC isoforms are important
molecular markers in studies of the process of atherosclerosis as well
as of phenotypic modulation of arterial smooth muscle cells (21 - 26).

As a result of the characterization of the 5' upstream promoter region of
the rabbit SMHC gene and of the role of the untranslated SMHC Exon 1
in the tissue-specific regulation of SMHC promoter activity,
we have surprisingly found that this 78 bp TATA-less SMHC Exon 1
on its own possesses promoter activity, in the absence of a 5' upstream
core promoter region, and even in the absence of any 5' upstream or 3'
downstream sequences. This Exon 1 promoter sequence does not
contain any TATA motif or CAAT motif to provide the second
promoter capacity mentioned above for the c-myc gene. This Exon 1
promoter also does not contain any known initiator sequences, and
does not require any 5' upstream region for its promoter activity. (As is
well known, the initiator element usually starts in the 5' upstream region
and encompasses the transcription start site.)

To investigate this new type of Exon 1 promoter, we have synthesized a
series of mutant oligonucleotides of the SMHC Exon 1 sequences,
Exon 1 (+1 to +79), Ex 60 (+1 to +61), Ex 40 (+1 to +41), and Ex 20
(+1 to +21), and cloned them into four pCAT vectors (pCATbasic,
pCATpromoter, pCATenhancer, and pCATcontrol; Promega) at either
promoter or enhancer positions. These vectors were designed specially
for testing promoter or enhancer activity and contain no other eukaryotic
promoter or enhancer sequences (pCATbasic), or the SV40 promoter but
no eukaryotic enhancer (pCATpromoter), or the SV40 enhancer but
no eukaryotic promoter (pCATenhancer), or both the SV40
promoter and enhancer as the positive control (pCATcontrol). Six
differentially positioned pCAT - Exon 1 constructions were used to
transfect primary rabbit vascular smooth muscle cells (RSMC).
Transient expression assays showed that in the absence of the 5' upstream
region (including the SMHC core promoter, a strong TATA promoter
which can drive CAT gene expression to a level equivalent to that of
pCATcontrol) and of any other cis-acting regulatory elements located in
the 5' upstream or intron 1 regions of the SMHC gene, the whole SMHC
Exon 1 sequence (79 bp) alone is sufficient to function as an alternative
promoter, driving chloramphenicol acetyltransferase (CAT) reporter
gene expression in RSMC cells to a level of 45 - 50% of that of
pCATcontrol (SV40 promoter plus enhancer).
This SMHC Exon 1 sequence functions as an alternative promoter, but
not as an enhancer, as the reverse orientation clone pCATbasic-Exon 1op
could not drive CAT gene expression. The SMHC Exon 1 sequence
showed no capacity to interact with the SV40 promoter (pCATpromoter-
Exon 1, Exon 1-pCATpromoter). Further analysis of the promoter
activity region located inside the SMHC Exon 1 sequence showed
that the alternative promoter activity requires the whole Exon 1 region.
The nested deletion mutants of SMHC Exon 1 extending from +1 to
+21 (Ex20), +41 (Ex40) and +61 (Ex60), which lack the 3' end of the
exon, showed no substantial promoter activity. It seems that the 3' end
of the SMHC Exon 1 sequence is essential for this alternative promoter
activity.

Interestingly, a transcription factor binding motif search using the GCG
program package showed there to be a number of putative transcription
factor binding sites in the SMHC Exon 1 region, including: two AP 2
sites (activator protein 2) (27) located at +12 to +19 and +16 to +23;
three GC-rich/GCF sites (28) located at +11 to +17, +61 to +67 and
+63 to +69; one SIF site (c-sis/PDGF induced, activates c-fos gene) (29)
at +45 to +50; and one E box (consensus binding site for bHLH proteins,
i.e. the MyoD family) located at +71 to +76. It may be that the E box or
the GC-rich/GCF binding sites in the 3' end region of the SMHC Exon 1
are critical to this exon 1 promoter activity. To answer this question, we
have constructed E box and GCF site mutants of SMHC Exon 1, the
CACTTG motif of the E box having been changed to GGTTTG
(EXMUT 1), and the GCGCC central motif of two GC-rich/GCF motifs
having been changed to AATTT (EXMUT 2). Transient expression
studies showed that each of these single mutations of the 3' end motifs
abolishes the Exon 1 promoter activity. This phenomenon is distinct
from all initiators reported so far, in that the whole length of exon 1 is
required for it to function as an alternative promoter. This is the first
time an untranslated TATA-less first exon has been disclosed to have
promoter activity in the absence of 5' upstream core promoter sequences.
The essential element for its promoter activity is located at the 3' end of
the Exon 1 sequence, and does not encompass the transcription start site
as defined for the initiator.
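The kind of motif search described above can be illustrated with a minimal Python sketch. It is not the GCG program package used in this work; it simply scans a sequence for a consensus motif written in IUPAC codes (for example the E box CANNTG) and reports 1-based positions in the "+N" convention of the text. The function name `find_motif` and the demonstration sequence are hypothetical; the demonstration sequence is filler and is not the SMHC Exon 1 sequence.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
}

def find_motif(sequence, motif):
    """Return 1-based inclusive (start, end) positions of every match.

    A lookahead is used so that overlapping sites are all reported.
    """
    body = "".join(IUPAC[base] for base in motif.upper())
    pattern = re.compile("(?=(" + body + "))")
    return [(m.start() + 1, m.start() + len(motif))
            for m in pattern.finditer(sequence.upper())]

# Hypothetical filler sequence for illustration only (NOT SMHC Exon 1).
demo = "AAGCGCCTTCACTTGAA"
e_box_sites = find_motif(demo, "CANNTG")   # E box consensus
gc_sites = find_motif(demo, "GCGCC")       # GC-rich central motif
```

Applied to the real exon sequence, the returned coordinates could be compared directly with the positions quoted in the text, such as +71 to +76 for the E box.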

In view of the multiple functions of highly conserved untranslated
first exons, it seems these actively transcribed but untranslated sequences
are particularly important regions involved in transcription
regulatory mechanisms and pathways which have still to be completely
understood. It has been shown that these untranslated first exons can
interact with a 5' upstream TATA promoter or other regulatory
elements to regulate expression levels or tissue-specificity. It has also
been shown that these untranslated first exons can contain a second TATA
promoter which confers an alternative transcription start site, or can
contain partial sequences of an initiator which encompasses the
transcription start site.

As another new function of the untranslated first exons of eukaryotic
genes, under certain circumstances, for example, in the absence of any
5' upstream core promoter and other regulatory element sequences as
well as any 3' flanking sequences, the untranslated first exons can
function as an alternative promoter per se, even without the presence
of a potential TATA box. The transcription initiation
complex might be assembled either via specific exon 1 binding proteins,
or via known general transcription factors, for instance TBP, binding
either directly, or indirectly through certain potential tissue-specific
and gene-specific transcription factors, as we show here that the
MyoD and GC-rich/GCF motifs play a critical role in the SMHC exon
1 promoter activity.

To extend this discovery of a new function of untranslated first exons to
other eukaryotic genes, we have also synthesized a 61 nt oligonucleotide
containing the whole length of the chicken skeletal Troponin I exon 1
sequence and cloned this into the pCATbasic vector, to test its promoter
activity in mammalian cells. The results show that this untranslated
TATA-less first exon of the sTnI gene also confers moderate promoter
activity in the absence of any 5' upstream sequences, including the sTnI
core promoter region, and also without 3' flanking (sTnI intron 1)
sequences.

SUMMARY OF THE INVENTION 

As the basis for this invention, we have discovered a new function of
untranslated first exons of eukaryotic genes, without any 5' or 3'
flanking sequences, namely that they function as an alternative promoter
per se, under certain circumstances, for instance, in the absence of the
normal 5' upstream core promoter sequences.

This exon 1 promoter function has been demonstrated by the synthesized
79 nt SMHC Exon 1 and 61 nt sTnI Exon 1 oligonucleotides (both
lacking any 5' or 3' flanking sequences) driving CAT reporter gene
expression in mammalian cells to a moderate level equivalent to
55-60% of the CAT activity driven by the 5' upstream SMHC core
promoter, or about 45-50% of the CAT activity driven by the SV40
promoter plus enhancer (pCATcontrol).

To dissect the essential elements in the SMHC exon 1 promoter, a series
of 3' end deletion mutants of the SMHC Exon 1 sequence have been
made, and transient expression assays showed that the 3' end 18
nucleotides were critical for this exon 1 promoter activity. Further
mutations of the MyoD motif or GC-rich motif, which are located in this
18 nucleotide 3' end region, abolished this exon 1 promoter activity. As
is well known, the MyoD motif CANNTG and GC-rich motif GCGCC
cannot themselves function as a promoter. Hence in one aspect these
protein binding motifs located inside the first exon can be utilized as
important components of the transcription initiation complex assembled
by the exon 1 promoter.

This strategy can be an alternative way for highly conserved genes,
particularly housekeeping genes, to maintain their basal levels of
transcription under certain circumstances, for instance, during genomic
rearrangements or gene translocations. When the normal gene-specific
functional 5' promoter regions are truncated or inactivated, the
transcribed first exon becomes the potential initiator, which requires the
whole genetic information carried in the gene-specific Exon 1 to
direct the temporary expression of these genes.

In one aspect, the present invention provides the use of the SMHC exon 1
promoter to construct moderate expression vectors, for use in "Melody
Gene Therapy", the multiple gene regulation strategy in gene
therapy, for fine control of the expression levels of concerted genes, and
to restore the proper balance of gene interactions. The advantages of a
highly conserved exon 1 promoter are its genetic stability, its small size,
the ready availability of sequence information, its ease of manipulation
and its usually compact genetic information. The accumulation of exon 1
promoter information provides a new source for the construction of new
expression vectors by combining varied motifs, which have been
naturally organized inside the first exons to confer some gene-specific
character, such as tissue-specific and differentiation-stage specificities.

Furthermore, the invention provides the use of the first exon sequences
of eukaryotic genes in the isolation and characterization of potential
transcription factors involved in the initiation and regulation of
transcription, as well as in the elucidation of interactions between
general and gene-specific transcription factors and eukaryotic
polymerase enzymes.

A further aspect of the application of the present invention is as a 
diagnostic tool in the in vitro testing for the response of a patient to 
gene therapy targeted to inhibit vascular smooth muscle cell proliferation 
and phenotype change in atherosclerosis and in intimal hyperplasia. 

In addition, the SMHC exon 1 promoter can be used as a probe for the
identification, isolation, characterization and application of other genes,
particularly of the myosin family including myosin heavy chain genes.
These in turn can themselves be used for cell specific expression of
phenotype markers and the expression of genes involved in the
regulation of cell specific functions, like those of proliferation, migration
and phenotype determination.

This invention further provides the use of exon 1 promoter sequences
from eukaryotic genes to construct eukaryotic expression vectors for the
expression of reporter genes or other marker genes for transient or long
term expression studies in mammalian cell cultures or in in vivo animal
model studies.

This invention also provides the use of these eukaryotic exon 1
mini-promoters for gene therapy studies, which avoids the safety issues
that have had to be addressed up to now with the viral vectors used in
clinical trials.

Further, this invention provides the use of these exon 1 promoter
sequences, modified by subtraction, addition or alteration of
sequences, for the construction of eukaryotic expression vectors to
modulate expression levels driven by these new eukaryotic promoters or
to give cell type specific expression, which can be provided in any
suitable form appropriate to the protocol of administration and/or needs
of a patient undergoing gene therapy, or as an in vitro diagnostic tool for
use in gene therapy.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURE 1 illustrates the sequence of the SMHC Exon 1 (+1 to +79)
oligonucleotides synthesized in the 5' to 3' orientation, and the putative
transcription factor binding sites.

FIGURE 2 illustrates the constructions of pCAT - SMHC Exon 1
plasmid vectors, in which the SMHC Exon 1 sequence was inserted into
different positions relative to the CAT reporter gene, to test its function
as promoter, enhancer or cis-regulatory element in mammalian cells.

FIGURE 3 illustrates the promoter activity of pCAT - SMHC Exon 1
constructs. RSMC, HSMC and other cell types were transfected with
these pCAT - SMHC Exon 1 plasmid DNAs, transiently expressed CAT
activity was measured by liquid scintillation counting (LSC) and the
transfection efficiency was normalized by co-transfected
β-galactosidase activity and by protein content assay. Relative CAT
activity is expressed as % of pCATcontrol.

FIGURE 4 illustrates the constructions of SMHC Exon 1 Promoter
mutants, including its 3' nested deletion mutants and the MyoD and
GC-rich/GCF motif mutants. All synthesized mutant oligonucleotides were
cloned into the pCATbasic vector upstream of the CAT reporter gene to
test their promoter activity in vascular smooth muscle cells.

FIGURE 5 illustrates the relative CAT activity driven by the SMHC
Exon 1 Promoter mutants of Figure 4 in RSMC and HSMC cells.

FIGURE 6 illustrates the synthesized oligonucleotide sequence of sTnI
Exon 1, the construction of the pCATbasic - sTnI Exon 1 plasmid vector
and its expression in mammalian cells.

DETAILED DESCRIPTION OF THE INVENTION AND
PREFERRED EMBODIMENTS

In relation to Figures 1 - 6 the following materials and methods were
used. Materials: [14C] chloramphenicol (40 - 60 mCi/mmol) and
[α-35S]-dATP were purchased from Amersham. Enzymes used in
plasmid construction were obtained from Boehringer Mannheim
Biochemicals and Promega. Plasmid vectors pCATbasic,
pCATpromoter, pCATenhancer, pCATcontrol and pSV-βGal were from
Promega. Bacterial chloramphenicol acetyltransferase (CAT) and acetyl
coenzyme A were from Pharmacia Inc.

Cell cultures: rabbit smooth muscle cells (RSMC) and rabbit endothelial
cells (RENDO) were obtained from the thoracic aorta of 8 to 10 week
old rabbits as described (30-31). Human aortic or vein smooth
muscle cells (HSMC) were prepared by explantation of aortic or vein
tissues obtained during cardiovascular surgery as described (32).
Human dermal fibroblasts (HDF) were obtained by enzymatic digestion
of tissues as described (33). Human Girard heart cells (HGH) were
obtained from ICN Biochemicals Ltd. All cells were maintained in
DMEM medium supplemented with 10 - 20% fetal calf serum, 2 mM
L-glutamine, 0.25 ug/ml fungizone, 100 u/ml penicillin and 100 ug/ml
streptomycin (all from GIBCO), at 37°C and 10% CO2. However,
RENDO cells were maintained in M199 medium supplemented with
20% fetal calf serum, 2 mM L-glutamine, 0.25 ug/ml fungizone,
100 u/ml penicillin and 100 u/ml streptomycin, plus 20 ug/ml endothelial
cell growth factor and 80 ug/ml heparin, within gelatin pre-coated flasks
at 37°C and 7% CO2.

Sequence computer analysis: The SMHC gene and sTnI gene
sequences were obtained from the Genbank database. A total of 162
transcription factor binding sites (34) were searched using the GCG
program package (via UNIX system, SEQNET, Daresbury Laboratory, UK).

DNA sequence analysis: Sequence analysis of the recombinant plasmids
was performed by the dideoxy chain termination method (35) using
Sequenase (USB). All constructions have been confirmed by
sequencing.

Plasmid construction: Equal amounts of the synthetic oligonucleotides
were kinase treated and purified by phenol extraction and ethanol
precipitation. dsDNA fragments were obtained by hybridizing the upper
strand and lower strand oligonucleotides mixed in hybridization buffer
(10 mM Tris-HCl, pH 7.5, 150 mM NaCl, 10 mM MgCl2) using a single
cycle of a PCR instrument (3 min/97°C, 15 min/65°C, 15 min/37°C and
15 min/24°C). These dsDNA fragments were cloned into pCAT vectors
by appropriate restriction enzyme digestion and blunt-end ligation at the
desired insert positions as indicated below. After transformation into
DH5α competent cells, selected recombinants were confirmed by
restriction enzyme mapping and DNA sequencing. The CsCl gradient
method (36) was used to purify the plasmid DNAs for transfection
studies.



(1) Exon 1 - pCATbasic: SMHC Exon 1 sequence (+1 to +79) was
inserted into the Sal I site, upstream of the CAT gene, at the promoter
position. The pCATbasic vector contains the pUC19 backbone, polylinker
site, CAT gene, SV40 small T antigen (terminal signal) and ampicillin
resistance gene. It lacks any eukaryotic promoter and enhancer
sequences.

(2) pCATpromoter - Exon 1: SMHC Exon 1 sequence (+1 to +79) was
inserted into the Sal I site, downstream of the CAT gene and SV40
terminal signal, at the enhancer position. The pCATpromoter vector
contains the SV40 promoter but lacks any eukaryotic enhancer sequence.

(3) Exon 1 - pCATpromoter: SMHC Exon 1 sequence (+1 to +79) was
inserted into the Bgl II site, upstream of the SV40 promoter and CAT
gene, at the cis-regulatory element position.

(4) Exon 1 - pCATenhancer: SMHC Exon 1 sequence (+1 to +79) was
inserted into the Sal I site, upstream of the CAT gene, at the promoter
position. The pCATenhancer vector lacks any eukaryotic promoter
sequence, but contains the SV40 enhancer, which is positioned
downstream of the CAT gene and SV40 terminal signal.

(5) Exon 1 - pCATcontrol: SMHC Exon 1 sequence (+1 to +79)
was inserted at the Bgl II site, upstream of the SV40 promoter and CAT
gene, at the cis-regulatory element position. pCATcontrol contains the
SV40 promoter plus enhancer, which can drive a high level of CAT
expression in eukaryotic cells.

(6) pCATcontrol - Exon 1: SMHC Exon 1 sequence (+1 to +79)
was inserted into the Sal I site, downstream of the SV40 enhancer, at the
enhancer position.

(7) Ex 20 - pCATbasic: 3' end deletion mutant. Partial SMHC Exon 1
sequence (+1 to +21) was inserted into the Sal I site of the pCATbasic
vector, upstream of the CAT gene, at the promoter position.

(8) Ex 40 - pCATbasic: 3' end deletion mutant. Partial SMHC Exon 1
sequence (+1 to +41) was inserted into the Sal I site of the pCATbasic
vector, upstream of the CAT gene, at the promoter position.

(9) Ex 60 - pCATbasic: 3' end deletion mutant. Partial SMHC Exon 1
sequence (+1 to +61) was inserted into the Sal I site of the pCATbasic
vector, upstream of the CAT gene, at the promoter position.

(10) Exon 1op - pCATbasic: SMHC Exon 1 sequence (+79 to +1)
was inserted into the Sal I site of the pCATbasic vector, but in the reverse
orientation, at the promoter position.

(11) EXMUT1 - pCATbasic: MyoD motif mutant of the SMHC
Exon 1 sequence (+1 to +79), in which the MyoD motif CACTTG
(+71 to +76) was replaced by GGTTTG. The mutated SMHC Exon 1
sequence was inserted into the Sal I site of the pCATbasic vector,
upstream of the CAT gene, at the promoter position.

(12) EXMUT2 - pCATbasic: GC-rich/GCF motif mutant of the
SMHC Exon 1 sequence (+1 to +79). Two of the GC-rich/GCF motifs
located at its 3' end, GCGCGCC (+60 to +66) and GCGCCCC (+62 to
+69), were mutated by changing the central GCGCC motif (+62 to +66)
into AATTT. The mutated SMHC Exon 1 sequence was inserted into
the Sal I site of the pCATbasic vector, upstream of the CAT gene, at the
promoter position.

(13) EXMUT2op - pCATbasic: The same insert sequence and
position as in (12) above, but cloned in the reverse orientation.

(14) sTnI Exon 1 - pCATbasic: sTnI Exon 1 sequence (+1 to
+61) was inserted into the Sal I site of the pCATbasic vector, upstream of
the CAT gene, at the promoter position.
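The insert sequences listed in (1) to (14) are all derived from a full-length exon sequence by truncation, orientation reversal or motif substitution. The following Python sketch shows, under stated assumptions, how such insert sequences could be derived and checked before oligonucleotide synthesis. The helper names are hypothetical, and `PLACEHOLDER` is a 79 nt filler built only so that the two motifs sit at the coordinates quoted above; it is not the SMHC Exon 1 sequence. Reading a reverse-orientation insert as the reverse complement of the top strand is likewise an assumption about how "(+79 to +1)" is to be interpreted.

```python
def subseq(seq, start, end):
    """Return positions start..end, 1-based inclusive (+1 is the first base)."""
    return seq[start - 1:end]

def reverse_complement(seq):
    """Top-strand reading of a double-stranded insert cloned in reverse."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def substitute(seq, start, end, expected, replacement):
    """Replace the motif at start..end after checking it is where expected."""
    if subseq(seq, start, end) != expected:
        raise ValueError("motif %s not found at +%d to +%d" % (expected, start, end))
    if len(replacement) != len(expected):
        raise ValueError("replacement must keep the insert length unchanged")
    return seq[:start - 1] + replacement + seq[end:]

# Hypothetical 79 nt filler (NOT the real exon): GCGCC at +62 to +66 and
# CACTTG at +71 to +76, matching the coordinates given in the text.
PLACEHOLDER = "A" * 61 + "GCGCC" + "AAAA" + "CACTTG" + "AAA"

constructs = {
    "Exon 1": PLACEHOLDER,
    "Ex 20": subseq(PLACEHOLDER, 1, 21),
    "Ex 40": subseq(PLACEHOLDER, 1, 41),
    "Ex 60": subseq(PLACEHOLDER, 1, 61),
    "Exon 1op": reverse_complement(PLACEHOLDER),
    "EXMUT1": substitute(PLACEHOLDER, 71, 76, "CACTTG", "GGTTTG"),
    "EXMUT2": substitute(PLACEHOLDER, 62, 66, "GCGCC", "AATTT"),
}
```

Checking the expected motif before substituting guards against an off-by-one error in the coordinates, which would otherwise silently produce a mutant at the wrong position.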

Cell transfection: All cells were transfected by the electroporation method
(37), using the BIO-RAD Gene Pulser system. 20-40 ug plasmid DNA was
electroporated at 260 V/960 uF (RSMC, HSMC, HGH, HDF) or 230
V/960 uF (RENDO) with -10* cells suspended in 0.5 ml electroporation 
buffer. After 48-60 hr, the transfected cells were harvested for CAT
activity assays. The transfection efficiency was normalized by
co-transfection of a β-Gal expression vector (pSV-βGal).

CAT enzyme activity: CAT activity was determined as described
(38) with some modifications. Briefly, cell extracts in 0.25 M Tris-HCl
pH 8.0 obtained by freeze-thaw were heated at 60°C for 10 min for the
inactivation of endogenous acetylase. Aliquots of 115 ul cell lysate
were incubated with 5 ul of 14C-chloramphenicol (Amersham, 0.025
mCi/ml) and 5 ul of n-butyryl coenzyme A (5 ug/ml, Sigma) at 37°C for
60 min. The reaction products were extracted with 300 ul mixed
xylenes (Aldrich), and the xylene phase was back-extracted twice with
0.25 M Tris-HCl pH 8.0 to remove all unreacted chloramphenicol, and
used for liquid scintillation counting (LSC).

β-Gal enzyme assay: Co-transfected β-Gal activity in cell lysates
(without heat inactivation) was determined as previously described (39).

Protein content assay: The protein content of cell lysates was
determined using the protein assay kit (BIO-RAD) based on the method
of Bradford (40).
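The three assays above are combined to give the relative CAT activity, expressed as % of pCATcontrol. The exact arithmetic is not stated in this description, so the following Python sketch rests on an assumed formula: CAT counts per ug of lysate protein are divided by β-Gal activity per ug of lysate protein, and the result for each construct is expressed relative to the same quantity for pCATcontrol. The class name, function names and all numerical values are hypothetical and are not measured data.

```python
from dataclasses import dataclass

@dataclass
class Lysate:
    cat_cpm: float          # LSC counts from the CAT assay aliquot
    cat_protein_ug: float   # protein in the CAT assay aliquot
    bgal_units: float       # beta-Gal activity from its own aliquot
    bgal_protein_ug: float  # protein in the beta-Gal assay aliquot

def normalized_cat(lysate):
    """Assumed normalization: CAT specific activity / beta-Gal specific activity."""
    if min(lysate.cat_protein_ug, lysate.bgal_protein_ug, lysate.bgal_units) <= 0:
        raise ValueError("protein and beta-Gal values must be positive")
    cat_specific = lysate.cat_cpm / lysate.cat_protein_ug
    bgal_specific = lysate.bgal_units / lysate.bgal_protein_ug
    return cat_specific / bgal_specific

def relative_cat_percent(sample, control):
    """Normalized CAT activity of a construct as % of the pCATcontrol value."""
    return 100.0 * normalized_cat(sample) / normalized_cat(control)

# Made-up numbers for illustration only.
pcat_control = Lysate(cat_cpm=20000, cat_protein_ug=100, bgal_units=2.0, bgal_protein_ug=50)
test_construct = Lysate(cat_cpm=9000, cat_protein_ug=100, bgal_units=1.8, bgal_protein_ug=50)
percent = relative_cat_percent(test_construct, pcat_control)
```

Dividing by the co-transfected β-Gal activity corrects for differences in transfection efficiency between cuvettes, while the protein term corrects for differences in the amount of lysate assayed.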

CLAIMS

1. The nucleotide sequence from +1 to +79 bp (Exon 1) of the rabbit
smooth muscle myosin heavy chain gene or the sequence from +1 to
+61 bp (Exon 1) of the chicken skeletal Troponin I gene.

2. The nucleotide sequence of claim 1 for use as a promoter in the
expression of one or more genes, especially reporter genes, in a
mammalian host cell.

3. The nucleotide sequence of claim 1 for use in the construction of
a eukaryotic expression vector comprising a reporter or marker gene.

4. A eukaryotic expression vector comprising the nucleotide
sequence of claim 1, or a portion or mutant thereof.

5. An expression vector according to claim 4, selected from any of 
these vectors disclosed herein with reference to the accompanying 
drawings. 

6. A plasmid comprising the nucleotide sequence of claim 1, or a
portion or mutant thereof.

7. A plasmid according to claim 6, selected from any of those
plasmids disclosed herein with reference to the accompanying drawings.

8. A mammalian host cell transfected with a gene comprising the
sequence of claim 1, or a portion or mutant thereof, as the promoter
therefor.

9. A cell according to claim 8, which is selected from rabbit or
human or other vascular smooth muscle cells, human dermal fibroblasts,
human Girard heart cells, rabbit skin fibroblasts, rabbit endothelial cells,
rabbit kidney epithelial cells or rat skeletal muscle myoblasts.

10. A method of initiating and/or regulating expression levels and/or
tissue specificity of a gene in a mammalian host cell by use of a
promoter therefor which comprises the nucleotide sequence of claim 1
or a portion or mutant thereof.

11. A method according to claim 10, wherein the host cell is a human
cell.

12. A method according to claim 10, wherein the said gene is that
which codes for chloramphenicol acetyl transferase (CAT) or
β-galactosidase or the firefly luciferase gene.

13. A method according to claim 10, wherein the host cell is a rabbit
or human or other vascular smooth muscle cell, human dermal
fibroblast, human Girard heart cell, rabbit skin fibroblast, rabbit
endothelial cell, rabbit kidney epithelial cell or rat skeletal muscle
myoblast.

14. Use of the nucleotide sequence of claim 1, or a portion or mutant
thereof, as a promoter in the expression of one or more genes, especially
one or more reporter genes, in a mammalian host cell.

15. Use of the nucleotide sequence of claim 1, or a portion or mutant
thereof, in the construction of a eukaryotic expression vector for the
transient or stable expression of a foreign gene in a mammalian host
cell.

16. Use of the nucleotide sequence of claim 1, or a portion or mutant
thereof, in the construction of a vector comprising one or more marker
genes for vascular SMC phenotype diagnosis.

17. Use of the nucleotide sequence of claim 1, or a portion or mutant
thereof, as a gene probe.



18. Use of the nucleotide sequence of claim 1, or a portion or mutant
thereof, in human gene therapy.

WO 94/29468    PCT/GB94/01251

[Drawing sheets 1/6 to 5/6: graphical content not recoverable from the
text extraction; sheet 5/6 bears the label Fig. 4.]